Overview

Dataset statistics

Number of variables: 16
Number of observations: 558837
Missing cells: 123376
Missing cells (%): 1.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 418.1 MiB
Average record size in memory: 784.4 B
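
The percentage and per-record figures in this block follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is missing cells divided by variables times observations, and that 1 MiB is 1024 * 1024 bytes (the variable names are illustrative and not taken from the tool):

```python
# Values copied from the "Dataset statistics" block of this report.
n_variables = 16
n_observations = 558_837
missing_cells = 123_376
total_size_mib = 418.1  # already rounded to one decimal in the report

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: MiB means 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.1f} B")
```

The record size computed this way lands close to the reported 784.4 B; the small gap is expected because the total size is itself rounded to one decimal place.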

Variable types

Numeric: 5
Text: 8
Categorical: 3

Alerts

color is highly overall correlated with transmission (High correlation)
mmr is highly overall correlated with odometer and 2 other fields (High correlation)
odometer is highly overall correlated with mmr and 2 other fields (High correlation)
sellingprice is highly overall correlated with mmr and 2 other fields (High correlation)
transmission is highly overall correlated with color (High correlation)
year is highly overall correlated with mmr and 2 other fields (High correlation)
transmission is highly imbalanced (88.9%) (Imbalance)
interior is highly imbalanced (50.4%) (Imbalance)
make has 10301 (1.8%) missing values (Missing)
model has 10399 (1.9%) missing values (Missing)
trim has 10651 (1.9%) missing values (Missing)
body has 13195 (2.4%) missing values (Missing)
transmission has 65352 (11.7%) missing values (Missing)
condition has 11820 (2.1%) missing values (Missing)

Reproduction

Analysis started: 2024-11-14 00:29:36.153523
Analysis finished: 2024-11-14 00:29:51.329888
Duration: 15.18 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json
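
The reported duration can be checked against the two timestamps. A minimal sketch using only the Python standard library, with the timestamps copied from this block:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block of this report.
started = datetime.fromisoformat("2024-11-14 00:29:36.153523")
finished = datetime.fromisoformat("2024-11-14 00:29:51.329888")

# Subtracting two datetimes yields a timedelta.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```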

Variables

year
Real number (ℝ)

HIGH CORRELATION 

Distinct: 34
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2010.0389
Minimum: 1982
Maximum: 2015
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.3 MiB

Quantile statistics

Minimum: 1982
5-th percentile: 2002
Q1: 2007
Median: 2012
Q3: 2013
95-th percentile: 2014
Maximum: 2015
Range: 33
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.9668636
Coefficient of variation (CV): 0.0019735258
Kurtosis: 1.0105063
Mean: 2010.0389
Median Absolute Deviation (MAD): 2
Skewness: -1.1832259
Sum: 1.1232841 × 10^9
Variance: 15.736007
Monotonicity: Not monotonic
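
Several of the figures for year are derived from one another. A minimal sketch of those relationships, using the standard definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared) and values copied from this section:

```python
# Values copied from the "year" quantile and descriptive statistics.
minimum, q1, q3, maximum = 1982, 2007, 2013, 2015
mean = 2010.0389
std_dev = 3.9668636

value_range = maximum - minimum   # reported as 33
iqr = q3 - q1                     # reported as 6
cv = std_dev / mean               # reported as 0.0019735258
variance = std_dev ** 2           # reported as 15.736007

print(value_range, iqr)
print(f"CV = {cv:.10f}")
print(f"variance = {variance:.6f}")
```

Small differences in the last digits are possible because the inputs are already rounded in the report.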
Histogram with fixed size bins (bins=34)

Common values
Value              Count    Frequency (%)
2012               102315   18.3%
2013               98168    17.6%
2014               81070    14.5%
2011               48548    8.7%
2008               31502    5.6%
2007               30845    5.5%
2006               26913    4.8%
2010               26485    4.7%
2005               21394    3.8%
2009               20594    3.7%
Other values (24)  71003    12.7%

Minimum 10 values
Value  Count  Frequency (%)
1982   2      < 0.1%
1983   1      < 0.1%
1984   5      < 0.1%
1985   10     < 0.1%
1986   11     < 0.1%
1987   8      < 0.1%
1988   11     < 0.1%
1989   20     < 0.1%
1990   49     < 0.1%
1991   67     < 0.1%

Maximum 10 values
Value  Count   Frequency (%)
2015   9437    1.7%
2014   81070   14.5%
2013   98168   17.6%
2012   102315  18.3%
2011   48548   8.7%
2010   26485   4.7%
2009   20594   3.7%
2008   31502   5.6%
2007   30845   5.5%
2006   26913   4.8%

make
Text

MISSING 

Distinct: 96
Distinct (%): < 0.1%
Missing: 10301
Missing (%): 1.8%
Memory size: 33.3 MiB

Length

Max length: 13
Median length: 11
Mean length: 5.9952236
Min length: 2

Characters and Unicode

Total characters: 3288596
Distinct characters: 49
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
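
As an illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over a few hypothetical values loosely modelled on this column; it is not the profiler's own implementation, and the standard library does not expose scripts or blocks, so only categories are shown. The two-letter codes map to the names used in this report: `Ll` is Lowercase Letter, `Lu` is Uppercase Letter, `Pd` is Dash Punctuation and `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter

# Hypothetical sample values, loosely modelled on the "make" column.
values = ["Kia", "BMW", "Volvo", "mercedes-benz", "land rover"]

# General category codes come from the Unicode Character Database
# bundled with Python.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

for code, count in category_counts.most_common():
    print(code, count)
```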

Unique

Unique: 8
Unique (%): < 0.1%

Sample

1st row: Kia
2nd row: Kia
3rd row: BMW
4th row: Volvo
5th row: BMW

Value              Count    Frequency (%)
ford               94001    17.1%
chevrolet          60587    11.0%
nissan             54017    9.8%
toyota             39966    7.3%
dodge              30956    5.6%
honda              27351    5.0%
hyundai            21837    4.0%
bmw                20793    3.8%
kia                18084    3.3%
chrysler           17485    3.2%
Other values (54)  165367   30.0%

Most occurring characters

ValueCountFrequency (%)
o 328819
 
10.0%
e 300106
 
9.1%
a 235580
 
7.2%
r 230200
 
7.0%
d 215895
 
6.6%
n 186317
 
5.7%
i 184908
 
5.6%
s 178494
 
5.4%
t 128322
 
3.9%
l 116781
 
3.6%
Other values (39) 1183174
36.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2626386
79.9%
Uppercase Letter 643142
 
19.6%
Dash Punctuation 17160
 
0.5%
Space Separator 1908
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 328819
12.5%
e 300106
11.4%
a 235580
9.0%
r 230200
8.8%
d 215895
8.2%
n 186317
 
7.1%
i 184908
 
7.0%
s 178494
 
6.8%
t 128322
 
4.9%
l 116781
 
4.4%
Other values (15) 520964
19.8%
Uppercase Letter
ValueCountFrequency (%)
C 95605
14.9%
F 94447
14.7%
M 67959
10.6%
N 57170
8.9%
H 49827
7.7%
B 43083
 
6.7%
T 40759
 
6.3%
D 30713
 
4.8%
I 22822
 
3.5%
W 20719
 
3.2%
Other values (12) 120038
18.7%
Dash Punctuation
ValueCountFrequency (%)
- 17160
100.0%
Space Separator
ValueCountFrequency (%)
1908
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3269528
99.4%
Common 19068
 
0.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 328819
 
10.1%
e 300106
 
9.2%
a 235580
 
7.2%
r 230200
 
7.0%
d 215895
 
6.6%
n 186317
 
5.7%
i 184908
 
5.7%
s 178494
 
5.5%
t 128322
 
3.9%
l 116781
 
3.6%
Other values (37) 1164106
35.6%
Common
ValueCountFrequency (%)
- 17160
90.0%
1908
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3288596
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 328819
 
10.0%
e 300106
 
9.1%
a 235580
 
7.2%
r 230200
 
7.0%
d 215895
 
6.6%
n 186317
 
5.7%
i 184908
 
5.6%
s 178494
 
5.4%
t 128322
 
3.9%
l 116781
 
3.6%
Other values (39) 1183174
36.0%

model
Text

MISSING 

Distinct: 973
Distinct (%): 0.2%
Missing: 10399
Missing (%): 1.9%
Memory size: 33.7 MiB

Length

Max length: 29
Median length: 23
Mean length: 6.7668214
Min length: 1

Characters and Unicode

Total characters: 3711182
Distinct characters: 66
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 72
Unique (%): < 0.1%

Sample

1st row: Sorento
2nd row: Sorento
3rd row: 3 Series
4th row: S60
5th row: 6 Series Gran Coupe

Value               Count    Frequency (%)
altima              19432    2.9%
series              15429    2.3%
grand               14928    2.2%
f-150               14527    2.2%
1500                14476    2.2%
fusion              13639    2.0%
camry               13515    2.0%
escape              12027    1.8%
focus               10463    1.6%
g                   9333     1.4%
Other values (740)  531345   79.4%

Most occurring characters

ValueCountFrequency (%)
a 376164
 
10.1%
r 276815
 
7.5%
e 269750
 
7.3%
o 195221
 
5.3%
n 184979
 
5.0%
i 170339
 
4.6%
s 149945
 
4.0%
t 136207
 
3.7%
l 132705
 
3.6%
C 123260
 
3.3%
Other values (56) 1695797
45.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2572552
69.3%
Uppercase Letter 715154
 
19.3%
Decimal Number 255786
 
6.9%
Space Separator 120676
 
3.3%
Dash Punctuation 46890
 
1.3%
Other Punctuation 124
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 376164
14.6%
r 276815
10.8%
e 269750
10.5%
o 195221
 
7.6%
n 184979
 
7.2%
i 170339
 
6.6%
s 149945
 
5.8%
t 136207
 
5.3%
l 132705
 
5.2%
u 121554
 
4.7%
Other values (16) 558873
21.7%
Uppercase Letter
ValueCountFrequency (%)
C 123260
17.2%
S 103749
14.5%
E 61270
8.6%
F 56534
 
7.9%
A 50623
 
7.1%
M 41654
 
5.8%
T 37328
 
5.2%
G 34766
 
4.9%
R 32269
 
4.5%
X 25588
 
3.6%
Other values (16) 148113
20.7%
Decimal Number
ValueCountFrequency (%)
0 93846
36.7%
5 59031
23.1%
3 31209
 
12.2%
1 30666
 
12.0%
2 13966
 
5.5%
4 10821
 
4.2%
6 8705
 
3.4%
7 4110
 
1.6%
9 2483
 
1.0%
8 949
 
0.4%
Other Punctuation
ValueCountFrequency (%)
/ 117
94.4%
& 7
 
5.6%
Space Separator
ValueCountFrequency (%)
120676
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 46890
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3287706
88.6%
Common 423476
 
11.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 376164
 
11.4%
r 276815
 
8.4%
e 269750
 
8.2%
o 195221
 
5.9%
n 184979
 
5.6%
i 170339
 
5.2%
s 149945
 
4.6%
t 136207
 
4.1%
l 132705
 
4.0%
C 123260
 
3.7%
Other values (42) 1272321
38.7%
Common
ValueCountFrequency (%)
120676
28.5%
0 93846
22.2%
5 59031
13.9%
- 46890
 
11.1%
3 31209
 
7.4%
1 30666
 
7.2%
2 13966
 
3.3%
4 10821
 
2.6%
6 8705
 
2.1%
7 4110
 
1.0%
Other values (4) 3556
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3711182
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 376164
 
10.1%
r 276815
 
7.5%
e 269750
 
7.3%
o 195221
 
5.3%
n 184979
 
5.0%
i 170339
 
4.6%
s 149945
 
4.0%
t 136207
 
3.7%
l 132705
 
3.6%
C 123260
 
3.3%
Other values (56) 1695797
45.7%

trim
Text

MISSING 

Distinct: 1963
Distinct (%): 0.4%
Missing: 10651
Missing (%): 1.9%
Memory size: 32.6 MiB

Length

Max length: 46
Median length: 37
Mean length: 4.7364088
Min length: 1

Characters and Unicode

Total characters: 2596433
Distinct characters: 72
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 241
Unique (%): < 0.1%

Sample

1st row: LX
2nd row: LX
3rd row: 328i SULEV
4th row: T5
5th row: 650i

Value               Count    Frequency (%)
base                56122    8.3%
se                  48401    7.2%
s                   30313    4.5%
lx                  21388    3.2%
limited             20583    3.1%
lt                  20224    3.0%
2.5                 18864    2.8%
xlt                 18796    2.8%
ls                  17932    2.7%
sport               17603    2.6%
Other values (963)  402195   59.8%

Most occurring characters

ValueCountFrequency (%)
L 215038
 
8.3%
S 206498
 
8.0%
e 155209
 
6.0%
i 135238
 
5.2%
E 127061
 
4.9%
124236
 
4.8%
T 120884
 
4.7%
a 108839
 
4.2%
r 97790
 
3.8%
X 91551
 
3.5%
Other values (62) 1214089
46.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1061836
40.9%
Uppercase Letter 1032530
39.8%
Decimal Number 302684
 
11.7%
Space Separator 124236
 
4.8%
Other Punctuation 52125
 
2.0%
Dash Punctuation 21013
 
0.8%
Math Symbol 1865
 
0.1%
Open Punctuation 72
 
< 0.1%
Close Punctuation 72
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 155209
14.6%
i 135238
12.7%
a 108839
10.3%
r 97790
9.2%
t 79239
7.5%
s 74067
 
7.0%
u 63188
 
6.0%
o 61772
 
5.8%
m 50958
 
4.8%
n 46407
 
4.4%
Other values (16) 189129
17.8%
Uppercase Letter
ValueCountFrequency (%)
L 215038
20.8%
S 206498
20.0%
E 127061
12.3%
T 120884
11.7%
X 91551
8.9%
B 58430
 
5.7%
G 33516
 
3.2%
V 30193
 
2.9%
P 28362
 
2.7%
C 20784
 
2.0%
Other values (15) 100213
9.7%
Decimal Number
ValueCountFrequency (%)
5 73106
24.2%
2 56765
18.8%
3 50679
16.7%
0 48184
15.9%
1 18385
 
6.1%
4 14525
 
4.8%
8 14233
 
4.7%
6 13271
 
4.4%
7 13229
 
4.4%
9 307
 
0.1%
Other Punctuation
ValueCountFrequency (%)
. 48974
94.0%
/ 2623
 
5.0%
! 463
 
0.9%
' 29
 
0.1%
& 26
 
< 0.1%
: 10
 
< 0.1%
Space Separator
ValueCountFrequency (%)
124236
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 21013
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1865
100.0%
Open Punctuation
ValueCountFrequency (%)
( 72
100.0%
Close Punctuation
ValueCountFrequency (%)
) 72
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2094366
80.7%
Common 502067
 
19.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
L 215038
 
10.3%
S 206498
 
9.9%
e 155209
 
7.4%
i 135238
 
6.5%
E 127061
 
6.1%
T 120884
 
5.8%
a 108839
 
5.2%
r 97790
 
4.7%
X 91551
 
4.4%
t 79239
 
3.8%
Other values (41) 757019
36.1%
Common
ValueCountFrequency (%)
124236
24.7%
5 73106
14.6%
2 56765
11.3%
3 50679
10.1%
. 48974
 
9.8%
0 48184
 
9.6%
- 21013
 
4.2%
1 18385
 
3.7%
4 14525
 
2.9%
8 14233
 
2.8%
Other values (11) 31967
 
6.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2596433
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
L 215038
 
8.3%
S 206498
 
8.0%
e 155209
 
6.0%
i 135238
 
5.2%
E 127061
 
4.9%
124236
 
4.8%
T 120884
 
4.7%
a 108839
 
4.2%
r 97790
 
3.8%
X 91551
 
3.5%
Other values (62) 1214089
46.8%

body
Text

MISSING 

Distinct: 87
Distinct (%): < 0.1%
Missing: 13195
Missing (%): 2.4%
Memory size: 32.8 MiB

Length

Max length: 23
Median length: 5
Mean length: 5.2792729
Min length: 3

Characters and Unicode

Total characters: 2880593
Distinct characters: 48
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 5
Unique (%): < 0.1%

Sample

1st row: SUV
2nd row: SUV
3rd row: Sedan
4th row: Sedan
5th row: Sedan

Value              Count    Frequency (%)
sedan              248760   42.1%
suv                143844   24.3%
cab                33137    5.6%
hatchback          26237    4.4%
minivan            25529    4.3%
coupe              19983    3.4%
crew               16394    2.8%
wagon              16180    2.7%
convertible        10933    1.9%
g                  9333     1.6%
Other values (33)  40608    6.9%

Most occurring characters

ValueCountFrequency (%)
a 398031
13.8%
e 352374
12.2%
n 338855
11.8%
S 338282
11.7%
d 262219
 
9.1%
V 124796
 
4.3%
U 119292
 
4.1%
C 78559
 
2.7%
b 77463
 
2.7%
s 73869
 
2.6%
Other values (38) 716853
24.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2092151
72.6%
Uppercase Letter 741046
 
25.7%
Space Separator 45296
 
1.6%
Dash Punctuation 1874
 
0.1%
Decimal Number 226
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 398031
19.0%
e 352374
16.8%
n 338855
16.2%
d 262219
12.5%
b 77463
 
3.7%
s 73869
 
3.5%
c 70363
 
3.4%
u 69821
 
3.3%
i 64724
 
3.1%
v 62002
 
3.0%
Other values (12) 322430
15.4%
Uppercase Letter
ValueCountFrequency (%)
S 338282
45.6%
V 124796
 
16.8%
U 119292
 
16.1%
C 78559
 
10.6%
M 21867
 
3.0%
H 21380
 
2.9%
W 13672
 
1.8%
G 7777
 
1.0%
E 5359
 
0.7%
R 4068
 
0.5%
Other values (9) 5994
 
0.8%
Decimal Number
ValueCountFrequency (%)
6 78
34.5%
0 78
34.5%
3 32
14.2%
7 32
14.2%
4 6
 
2.7%
Space Separator
ValueCountFrequency (%)
45296
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1874
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2833197
98.4%
Common 47396
 
1.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 398031
14.0%
e 352374
12.4%
n 338855
12.0%
S 338282
11.9%
d 262219
 
9.3%
V 124796
 
4.4%
U 119292
 
4.2%
C 78559
 
2.8%
b 77463
 
2.7%
s 73869
 
2.6%
Other values (31) 669457
23.6%
Common
ValueCountFrequency (%)
45296
95.6%
- 1874
 
4.0%
6 78
 
0.2%
0 78
 
0.2%
3 32
 
0.1%
7 32
 
0.1%
4 6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2880593
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 398031
13.8%
e 352374
12.2%
n 338855
11.8%
S 338282
11.7%
d 262219
 
9.1%
V 124796
 
4.3%
U 119292
 
4.1%
C 78559
 
2.7%
b 77463
 
2.7%
s 73869
 
2.6%
Other values (38) 716853
24.9%

transmission
Categorical

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct: 4
Distinct (%): < 0.1%
Missing: 65352
Missing (%): 11.7%
Memory size: 35.0 MiB

Value      Count
automatic  475915
manual     17544
sedan      15
Sedan      11

Length

Max length: 9
Median length: 9
Mean length: 8.8931356
Min length: 5

Characters and Unicode

Total characters: 4388629
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: automatic
2nd row: automatic
3rd row: automatic
4th row: automatic
5th row: automatic

Common Values

Value      Count    Frequency (%)
automatic  475915   85.2%
manual     17544    3.1%
sedan      15       < 0.1%
Sedan      11       < 0.1%
(Missing)  65352    11.7%
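
The 88.9% imbalance figure raised in the alerts can be reproduced from the non-missing counts in the Common Values table. A minimal sketch, assuming the score is one minus the Shannon entropy of the class frequencies divided by the maximum entropy for that number of classes. That definition is an assumption here, not something this report states, although it does reproduce the reported value:

```python
import math

# Non-missing counts copied from the "transmission" Common Values table.
counts = {"automatic": 475_915, "manual": 17_544, "sedan": 15, "Sedan": 11}

total = sum(counts.values())
probabilities = [c / total for c in counts.values()]

# Shannon entropy in bits, and its maximum for this many classes.
entropy = -sum(p * math.log2(p) for p in probabilities)
max_entropy = math.log2(len(counts))

# Assumed definition: 0 means perfectly balanced, 1 means a single class.
imbalance = 1 - entropy / max_entropy
print(f"imbalance: {imbalance:.1%}")
```

The `sedan` and `Sedan` rows look like body-style values that ended up in the wrong column; they are counted as separate classes here because the report lists them separately.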

Length

Histogram of lengths of the category

Common Values (Plot)

Value      Count    Frequency (%)
automatic  475915   96.4%
manual     17544    3.6%
sedan      26       < 0.1%

Most occurring characters

ValueCountFrequency (%)
a 986944
22.5%
t 951830
21.7%
u 493459
11.2%
m 493459
11.2%
o 475915
10.8%
i 475915
10.8%
c 475915
10.8%
n 17570
 
0.4%
l 17544
 
0.4%
e 26
 
< 0.1%
Other values (3) 52
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4388618
> 99.9%
Uppercase Letter 11
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 986944
22.5%
t 951830
21.7%
u 493459
11.2%
m 493459
11.2%
o 475915
10.8%
i 475915
10.8%
c 475915
10.8%
n 17570
 
0.4%
l 17544
 
0.4%
e 26
 
< 0.1%
Other values (2) 41
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
S 11
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4388629
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 986944
22.5%
t 951830
21.7%
u 493459
11.2%
m 493459
11.2%
o 475915
10.8%
i 475915
10.8%
c 475915
10.8%
n 17570
 
0.4%
l 17544
 
0.4%
e 26
 
< 0.1%
Other values (3) 52
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4388629
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 986944
22.5%
t 951830
21.7%
u 493459
11.2%
m 493459
11.2%
o 475915
10.8%
i 475915
10.8%
c 475915
10.8%
n 17570
 
0.4%
l 17544
 
0.4%
e 26
 
< 0.1%
Other values (3) 52
 
< 0.1%

vin
Text

Distinct: 550297
Distinct (%): 98.5%
Missing: 4
Missing (%): < 0.1%
Memory size: 39.4 MiB

Length

Max length: 17
Median length: 17
Mean length: 16.999685
Min length: 9

Characters and Unicode

Total characters: 9499985
Distinct characters: 35
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 541970
Unique (%): 97.0%

Sample

1st row: 5xyktca69fg566472
2nd row: 5xyktca69fg561319
3rd row: wba3c1c51ek116351
4th row: yv1612tb4f1310987
5th row: wba6b2c57ed129731

Value                  Count    Frequency (%)
automatic              22       < 0.1%
wbanv13588cz57827      5        < 0.1%
wp0ca2988xu629622      4        < 0.1%
1ftfw1cv5afb30053      4        < 0.1%
wddgf56x78f009940      4        < 0.1%
5uxfe43579l274932      4        < 0.1%
5n1ar1nn2bc632869      4        < 0.1%
trusc28n241022003      4        < 0.1%
1hgcp3f8xca021624      3        < 0.1%
1n6aa06b06n500808      3        < 0.1%
Other values (550287)  558776   > 99.9%
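
The most frequent value in this column, `automatic`, is a transmission label rather than a VIN, and its 9 characters match the reported minimum length. A minimal sketch of a format check that would flag such entries, assuming the modern 17-character VIN layout (digits and letters, with I, O and Q excluded) and the lowercase spelling seen in this dataset. The function name is hypothetical, and the check digit is not validated:

```python
import re

# 17 characters, digits or letters, with i, o and q excluded.
VIN_PATTERN = re.compile(r"[0-9a-hj-npr-z]{17}")

def looks_like_vin(value: str) -> bool:
    """Return True when value matches the assumed 17-character VIN layout."""
    return VIN_PATTERN.fullmatch(value.strip().lower()) is not None

# Values copied from this section of the report.
for candidate in ["wbanv13588cz57827", "5xyktca69fg566472", "automatic"]:
    print(candidate, looks_like_vin(candidate))
```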

Most occurring characters

ValueCountFrequency (%)
1 919859
 
9.7%
2 636870
 
6.7%
3 612473
 
6.4%
5 595561
 
6.3%
4 574947
 
6.1%
0 498899
 
5.3%
6 487522
 
5.1%
7 458866
 
4.8%
8 455044
 
4.8%
c 381531
 
4.0%
Other values (25) 3878413
40.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5616389
59.1%
Lowercase Letter 3883596
40.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 381531
 
9.8%
a 362723
 
9.3%
f 282719
 
7.3%
d 282571
 
7.3%
b 268419
 
6.9%
e 243626
 
6.3%
g 231727
 
6.0%
n 191831
 
4.9%
k 157970
 
4.1%
h 149920
 
3.9%
Other values (15) 1330559
34.3%
Decimal Number
ValueCountFrequency (%)
1 919859
16.4%
2 636870
11.3%
3 612473
10.9%
5 595561
10.6%
4 574947
10.2%
0 498899
8.9%
6 487522
8.7%
7 458866
8.2%
8 455044
8.1%
9 376348
6.7%

Most occurring scripts

ValueCountFrequency (%)
Common 5616389
59.1%
Latin 3883596
40.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 381531
 
9.8%
a 362723
 
9.3%
f 282719
 
7.3%
d 282571
 
7.3%
b 268419
 
6.9%
e 243626
 
6.3%
g 231727
 
6.0%
n 191831
 
4.9%
k 157970
 
4.1%
h 149920
 
3.9%
Other values (15) 1330559
34.3%
Common
ValueCountFrequency (%)
1 919859
16.4%
2 636870
11.3%
3 612473
10.9%
5 595561
10.6%
4 574947
10.2%
0 498899
8.9%
6 487522
8.7%
7 458866
8.2%
8 455044
8.1%
9 376348
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9499985
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 919859
 
9.7%
2 636870
 
6.7%
3 612473
 
6.4%
5 595561
 
6.3%
4 574947
 
6.1%
0 498899
 
5.3%
6 487522
 
5.1%
7 458866
 
4.8%
8 455044
 
4.8%
c 381531
 
4.0%
Other values (25) 3878413
40.8%

state
Text

Distinct: 64
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.4 MiB

Length

Max length: 17
Median length: 2
Mean length: 2.0006979
Min length: 2

Characters and Unicode

Total characters: 1118064
Distinct characters: 36
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 26
Unique (%): < 0.1%

Sample

1st row: ca
2nd row: ca
3rd row: ca
4th row: ca
5th row: ca

Value              Count    Frequency (%)
fl                 82945    14.8%
ca                 73148    13.1%
pa                 53907    9.6%
tx                 45913    8.2%
ga                 34750    6.2%
nj                 27784    5.0%
il                 23486    4.2%
nc                 21845    3.9%
oh                 21575    3.9%
tn                 20895    3.7%
Other values (54)  152589   27.3%

Most occurring characters

ValueCountFrequency (%)
a 199889
17.9%
n 110349
9.9%
l 108648
9.7%
c 108264
9.7%
f 82971
 
7.4%
t 68644
 
6.1%
m 60888
 
5.4%
p 56632
 
5.1%
i 54410
 
4.9%
o 50032
 
4.5%
Other values (26) 217337
19.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1117805
> 99.9%
Decimal Number 259
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 199889
17.9%
n 110349
9.9%
l 108648
9.7%
c 108264
9.7%
f 82971
 
7.4%
t 68644
 
6.1%
m 60888
 
5.4%
p 56632
 
5.1%
i 54410
 
4.9%
o 50032
 
4.5%
Other values (16) 217078
19.4%
Decimal Number
ValueCountFrequency (%)
3 44
17.0%
1 44
17.0%
2 42
16.2%
7 41
15.8%
6 21
8.1%
5 20
7.7%
8 14
 
5.4%
9 14
 
5.4%
4 10
 
3.9%
0 9
 
3.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 1117805
> 99.9%
Common 259
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 199889
17.9%
n 110349
9.9%
l 108648
9.7%
c 108264
9.7%
f 82971
 
7.4%
t 68644
 
6.1%
m 60888
 
5.4%
p 56632
 
5.1%
i 54410
 
4.9%
o 50032
 
4.5%
Other values (16) 217078
19.4%
Common
ValueCountFrequency (%)
3 44
17.0%
1 44
17.0%
2 42
16.2%
7 41
15.8%
6 21
8.1%
5 20
7.7%
8 14
 
5.4%
9 14
 
5.4%
4 10
 
3.9%
0 9
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1118064
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 199889
17.9%
n 110349
9.9%
l 108648
9.7%
c 108264
9.7%
f 82971
 
7.4%
t 68644
 
6.1%
m 60888
 
5.4%
p 56632
 
5.1%
i 54410
 
4.9%
o 50032
 
4.5%
Other values (26) 217337
19.4%

condition
Real number (ℝ)

MISSING 

Distinct: 41
Distinct (%): < 0.1%
Missing: 11820
Missing (%): 2.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.672365
Minimum: 1
Maximum: 49
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.3 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 23
Median: 35
Q3: 42
95-th percentile: 47
Maximum: 49
Range: 48
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 13.402832
Coefficient of variation (CV): 0.43696767
Kurtosis: -0.22342048
Mean: 30.672365
Median Absolute Deviation (MAD): 8
Skewness: -0.83295619
Sum: 16778305
Variance: 179.6359
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=41)

Common values
Value              Count    Frequency (%)
19                 42281    7.6%
35                 26750    4.8%
37                 25938    4.6%
44                 25514    4.6%
43                 24937    4.5%
42                 24328    4.4%
36                 23144    4.1%
41                 23073    4.1%
2                  20790    3.7%
4                  19922    3.6%
Other values (31)  290340   52.0%

Minimum 10 values
Value  Count  Frequency (%)
1      7364   1.3%
2      20790  3.7%
3      10803  1.9%
4      19922  3.6%
5      11222  2.0%
11     87     < 0.1%
12     95     < 0.1%
13     82     < 0.1%
14     134    < 0.1%
15     144    < 0.1%

Maximum 10 values
Value  Count  Frequency (%)
49     13099  2.3%
48     12712  2.3%
47     11363  2.0%
46     12634  2.3%
45     12313  2.2%
44     25514  4.6%
43     24937  4.5%
42     24328  4.4%
41     23073  4.1%
39     19920  3.6%

odometer
Real number (ℝ)

HIGH CORRELATION 

Distinct: 172278
Distinct (%): 30.8%
Missing: 94
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 68320.018
Minimum: 1
Maximum: 999999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.3 MiB

Quantile statistics

Minimum: 1
5-th percentile: 10512
Q1: 28371
Median: 52254
Q3: 99109
95-th percentile: 170056.9
Maximum: 999999
Range: 999998
Interquartile range (IQR): 70738

Descriptive statistics

Standard deviation: 53398.543
Coefficient of variation (CV): 0.78159439
Kurtosis: 13.548941
Mean: 68320.018
Median Absolute Deviation (MAD): 30475
Skewness: 1.8431998
Sum: 3.8173332 × 10^10
Variance: 2.8514044 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
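
The reported minimum of 1 and maximum of 999999 look more like placeholder entries than measured mileage, and values like these can distort the mean, variance and kurtosis. A minimal sketch of how such readings could be separated before computing statistics, assuming those two values are treated as sentinels. That treatment is an assumption, not something this report states, and the sample list is made up for illustration:

```python
from statistics import median

# Assumption: these readings are placeholders, not real mileage.
SENTINELS = {1, 999_999}

def split_sentinels(readings):
    """Separate plausible readings from assumed placeholder values."""
    kept = [r for r in readings if r not in SENTINELS]
    flagged = [r for r in readings if r in SENTINELS]
    return kept, flagged

# Made-up readings for illustration only.
sample = [1, 21_587, 52_254, 999_999, 99_109, 1]
kept, flagged = split_sentinels(sample)

print("flagged:", flagged)
print("median with sentinels:", median(sample))
print("median without sentinels:", median(kept))
```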

Common values
Value                  Count    Frequency (%)
1                      1318     0.2%
999999                 72       < 0.1%
10                     29       < 0.1%
21587                  21       < 0.1%
21310                  18       < 0.1%
29137                  18       < 0.1%
2                      18       < 0.1%
8                      18       < 0.1%
36007                  17       < 0.1%
33995                  17       < 0.1%
Other values (172268)  557197   99.7%
(Missing)              94       < 0.1%

Minimum 10 values
Value  Count  Frequency (%)
1      1318   0.2%
2      18     < 0.1%
3      9      < 0.1%
4      9      < 0.1%
5      17     < 0.1%
6      13     < 0.1%
7      13     < 0.1%
8      18     < 0.1%
9      11     < 0.1%
10     29     < 0.1%

Maximum 10 values
Value   Count  Frequency (%)
999999  72     < 0.1%
980113  1      < 0.1%
959276  1      < 0.1%
694978  2      < 0.1%
621388  1      < 0.1%
580956  1      < 0.1%
537334  1      < 0.1%
522212  1      < 0.1%
500227  1      < 0.1%
495757  1      < 0.1%

color
Categorical

HIGH CORRELATION 

Distinct: 46
Distinct (%): < 0.1%
Missing: 749
Missing (%): 0.1%
Memory size: 33.6 MiB

Value              Count
black              110970
white              106673
silver             83389
gray               82857
blue               51139
Other values (41)  123060

Length

Max length: 9
Median length: 8
Mean length: 4.6275032
Min length: 1

Characters and Unicode

Total characters: 2582554
Distinct characters: 35
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 26
Unique (%): < 0.1%

Sample

1st row: white
2nd row: white
3rd row: gray
4th row: white
5th row: gray

Common Values

Value              Count    Frequency (%)
black              110970   19.9%
white              106673   19.1%
silver             83389    14.9%
gray               82857    14.8%
blue               51139    9.2%
red                43569    7.8%
—                  24685    4.4%
green              11382    2.0%
gold               11342    2.0%
beige              9222     1.7%
Other values (36)  22860    4.1%

Length

Histogram of lengths of the category

Value              Count    Frequency (%)
black              110970   19.9%
white              106673   19.1%
silver             83389    14.9%
gray               82857    14.8%
blue               51139    9.2%
red                43569    7.8%
—                  24685    4.4%
green              11382    2.0%
gold               11342    2.0%
beige              9222     1.7%
Other values (36)  22860    4.1%

Most occurring characters

ValueCountFrequency (%)
e 332602
12.9%
l 261465
 
10.1%
r 241240
 
9.3%
i 201026
 
7.8%
a 196863
 
7.6%
b 187020
 
7.2%
g 125853
 
4.9%
w 116124
 
4.5%
c 111928
 
4.3%
k 111012
 
4.3%
Other values (25) 697421
27.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2556309
99.0%
Dash Punctuation 26134
 
1.0%
Decimal Number 111
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 332602
13.0%
l 261465
 
10.2%
r 241240
 
9.4%
i 201026
 
7.9%
a 196863
 
7.7%
b 187020
 
7.3%
g 125853
 
4.9%
w 116124
 
4.5%
c 111928
 
4.4%
k 111012
 
4.3%
Other values (13) 671176
26.3%
Decimal Number
ValueCountFrequency (%)
1 20
18.0%
8 14
12.6%
2 13
11.7%
7 12
10.8%
6 12
10.8%
3 9
8.1%
5 9
8.1%
0 8
 
7.2%
4 7
 
6.3%
9 7
 
6.3%
Dash Punctuation
ValueCountFrequency (%)
— 24685
94.5%
- 1449
 
5.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 2556309
99.0%
Common 26245
 
1.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 332602
13.0%
l 261465
 
10.2%
r 241240
 
9.4%
i 201026
 
7.9%
a 196863
 
7.7%
b 187020
 
7.3%
g 125853
 
4.9%
w 116124
 
4.5%
c 111928
 
4.4%
k 111012
 
4.3%
Other values (13) 671176
26.3%
Common
ValueCountFrequency (%)
— 24685
94.1%
- 1449
 
5.5%
1 20
 
0.1%
8 14
 
0.1%
2 13
 
< 0.1%
7 12
 
< 0.1%
6 12
 
< 0.1%
3 9
 
< 0.1%
5 9
 
< 0.1%
0 8
 
< 0.1%
Other values (2) 14
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2557869
99.0%
Punctuation 24685
 
1.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 332602
13.0%
l 261465
 
10.2%
r 241240
 
9.4%
i 201026
 
7.9%
a 196863
 
7.7%
b 187020
 
7.3%
g 125853
 
4.9%
w 116124
 
4.5%
c 111928
 
4.4%
k 111012
 
4.3%
Other values (24) 672736
26.3%
Punctuation
ValueCountFrequency (%)
— 24685
100.0%

interior
Categorical

IMBALANCE 

Distinct: 17
Distinct (%): < 0.1%
Missing: 749
Missing (%): 0.1%
Memory size: 33.2 MiB

Value              Count
black              244329
gray               178581
beige              59758
tan                44093
—                  17077
Other values (12)  14250

Length

Max length: 9
Median length: 5
Mean length: 4.399437
Min length: 1

Characters and Unicode

Total characters: 2455273
Distinct characters: 23
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: black
2nd row: beige
3rd row: black
4th row: black
5th row: black

Common Values

Value             Count    Frequency (%)
black             244329   43.7%
gray              178581   32.0%
beige             59758    10.7%
tan               44093    7.9%
—                 17077    3.1%
brown             8640     1.5%
red               1363     0.2%
blue              1143     0.2%
silver            1104     0.2%
off-white         480      0.1%
Other values (7)  1520     0.3%
(Missing)         749      0.1%

Length

Histogram of lengths of the category

Value             Count    Frequency (%)
black             244329   43.8%
gray              178581   32.0%
beige             59758    10.7%
tan               44093    7.9%
—                 17077    3.1%
brown             8640     1.5%
red               1363     0.2%
blue              1143     0.2%
silver            1104     0.2%
off-white         480      0.1%
Other values (7)  1520     0.3%

Most occurring characters

ValueCountFrequency (%)
a 467148
19.0%
b 314061
12.8%
l 247279
10.1%
c 244329
10.0%
k 244329
10.0%
g 239244
9.7%
r 190608
7.8%
y 178792
 
7.3%
e 124856
 
5.1%
i 61598
 
2.5%
Other values (13) 143029
 
5.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2437716
99.3%
Dash Punctuation 17557
 
0.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 467148
19.2%
b 314061
12.9%
l 247279
10.1%
c 244329
10.0%
k 244329
10.0%
g 239244
9.8%
r 190608
7.8%
y 178792
 
7.3%
e 124856
 
5.1%
i 61598
 
2.5%
Other values (11) 125472
 
5.1%
Dash Punctuation
ValueCountFrequency (%)
— 17077
97.3%
- 480
 
2.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 2437716
99.3%
Common 17557
 
0.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 467148
19.2%
b 314061
12.9%
l 247279
10.1%
c 244329
10.0%
k 244329
10.0%
g 239244
9.8%
r 190608
7.8%
y 178792
 
7.3%
e 124856
 
5.1%
i 61598
 
2.5%
Other values (11) 125472
 
5.1%
Common
ValueCountFrequency (%)
— 17077
97.3%
- 480
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2438196
99.3%
Punctuation 17077
 
0.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 467148
19.2%
b 314061
12.9%
l 247279
10.1%
c 244329
10.0%
k 244329
10.0%
g 239244
9.8%
r 190608
7.8%
y 178792
 
7.3%
e 124856
 
5.1%
i 61598
 
2.5%
Other values (12) 125952
 
5.2%
Punctuation
ValueCountFrequency (%)
— 17077
100.0%

seller
Text

Distinct14263
Distinct (%)2.6%
Missing0
Missing (%)0.0%
Memory size42.6 MiB
2024-11-13T19:29:54.121279image/svg+xmlMatplotlib v3.9.0, https://matplotlib.org/

Length

Max length50
Median length42
Mean length22.990296
Min length3

Characters and Unicode

Total characters12847828
Distinct characters47
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4949 ?
Unique (%)0.9%

Sample

1st rowkia motors america inc
2nd rowkia motors america inc
3rd rowfinancial services remarketing (lease)
4th rowvolvo na rep/world omni
5th rowfinancial services remarketing (lease)
ValueCountFrequency (%)
inc 86910
 
4.6%
services 48242
 
2.6%
corporation 47851
 
2.5%
auto 47453
 
2.5%
credit 46960
 
2.5%
motor 45807
 
2.4%
llc 45554
 
2.4%
financial 44151
 
2.3%
ford 36212
 
1.9%
remarketing 35475
 
1.9%
Other values (8580) 1395315
74.2%

Most occurring characters

ValueCountFrequency (%)
(space) 1339805
 
10.4%
e 1146348
 
8.9%
a 1052365
 
8.2%
r 962260
 
7.5%
n 953457
 
7.4%
i 917323
 
7.1%
o 863204
 
6.7%
t 796237
 
6.2%
c 736366
 
5.7%
s 671384
 
5.2%
Other values (37) 3409079
26.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11300229
88.0%
Space Separator 1339805
 
10.4%
Other Punctuation 148773
 
1.2%
Dash Punctuation 35787
 
0.3%
Decimal Number 10257
 
0.1%
Close Punctuation 6484
 
0.1%
Open Punctuation 6484
 
0.1%
Math Symbol 9
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1146348
10.1%
a 1052365
9.3%
r 962260
 
8.5%
n 953457
 
8.4%
i 917323
 
8.1%
o 863204
 
7.6%
t 796237
 
7.0%
c 736366
 
6.5%
s 671384
 
5.9%
l 574545
 
5.1%
Other values (16) 2626740
23.2%
Decimal Number
ValueCountFrequency (%)
2 3079
30.0%
1 2187
21.3%
0 1351
13.2%
9 771
 
7.5%
5 682
 
6.6%
8 615
 
6.0%
4 607
 
5.9%
3 440
 
4.3%
6 355
 
3.5%
7 170
 
1.7%
Other Punctuation
ValueCountFrequency (%)
/ 105904
71.2%
. 28458
 
19.1%
& 9444
 
6.3%
' 2874
 
1.9%
# 2090
 
1.4%
: 3
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 1339805
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 35787
100.0%
Close Punctuation
ValueCountFrequency (%)
) 6484
100.0%
Open Punctuation
ValueCountFrequency (%)
( 6484
100.0%
Math Symbol
ValueCountFrequency (%)
+ 9
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11300229
88.0%
Common 1547599
 
12.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1146348
10.1%
a 1052365
9.3%
r 962260
 
8.5%
n 953457
 
8.4%
i 917323
 
8.1%
o 863204
 
7.6%
t 796237
 
7.0%
c 736366
 
6.5%
s 671384
 
5.9%
l 574545
 
5.1%
Other values (16) 2626740
23.2%
Common
ValueCountFrequency (%)
(space) 1339805
86.6%
/ 105904
 
6.8%
- 35787
 
2.3%
. 28458
 
1.8%
& 9444
 
0.6%
) 6484
 
0.4%
( 6484
 
0.4%
2 3079
 
0.2%
' 2874
 
0.2%
1 2187
 
0.1%
Other values (11) 7093
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12847828
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 1339805
 
10.4%
e 1146348
 
8.9%
a 1052365
 
8.2%
r 962260
 
7.5%
n 953457
 
7.4%
i 917323
 
7.1%
o 863204
 
6.7%
t 796237
 
6.2%
c 736366
 
5.7%
s 671384
 
5.2%
Other values (37) 3409079
26.5%

mmr
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1101
Distinct (%): 0.2%
Missing: 38
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 13769.377
Minimum: 25
Maximum: 182000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.3 MiB

Quantile statistics

Minimum: 25
5-th percentile: 1800
Q1: 7100
median: 12250
Q3: 18300
95-th percentile: 30600
Maximum: 182000
Range: 181975
Interquartile range (IQR): 11200

Descriptive statistics

Standard deviation: 9679.9672
Coefficient of variation (CV): 0.70300688
Kurtosis: 11.443328
Mean: 13769.377
Median Absolute Deviation (MAD): 5575
Skewness: 1.9976441
Sum: 7.6943144 × 10^9
Variance: 93701764
Monotonicity: Not monotonic
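
Several of the mmr figures above are simple functions of the others. The sketch below recomputes them from the reported values as a consistency check. It uses the standard textbook definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared), which are assumed here to match what the report uses; small differences come from the rounding of the printed inputs.

```python
# Values copied from the mmr statistics in this report.
minimum, maximum = 25, 182000
q1, q3 = 7100, 18300
mean, std = 13769.377, 9679.9672

value_range = maximum - minimum  # reported Range: 181975
iqr = q3 - q1                    # reported IQR: 11200
cv = std / mean                  # reported CV: 0.70300688
variance = std ** 2              # reported Variance: 93701764

print(value_range, iqr)
print(round(cv, 6), round(variance))
```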
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
12500 1761
 
0.3%
11600 1751
 
0.3%
11650 1746
 
0.3%
12150 1722
 
0.3%
11850 1717
 
0.3%
11300 1716
 
0.3%
11750 1709
 
0.3%
12350 1702
 
0.3%
12700 1701
 
0.3%
11950 1694
 
0.3%
Other values (1091) 541580
96.9%
ValueCountFrequency (%)
25 30
 
< 0.1%
50 44
< 0.1%
75 23
 
< 0.1%
100 33
 
< 0.1%
125 40
< 0.1%
150 45
< 0.1%
175 69
< 0.1%
200 54
< 0.1%
225 60
< 0.1%
250 83
< 0.1%
ValueCountFrequency (%)
182000 1
 
< 0.1%
178000 1
 
< 0.1%
176000 1
 
< 0.1%
172000 1
 
< 0.1%
170000 3
< 0.1%
166000 3
< 0.1%
164000 1
 
< 0.1%
163000 1
 
< 0.1%
162000 1
 
< 0.1%
161000 1
 
< 0.1%

sellingprice
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1887
Distinct (%): 0.3%
Missing: 12
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 13611.359
Minimum: 1
Maximum: 230000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.3 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1500
Q1: 6900
median: 12100
Q3: 18200
95-th percentile: 30600
Maximum: 230000
Range: 229999
Interquartile range (IQR): 11300

Descriptive statistics

Standard deviation: 9749.5016
Coefficient of variation (CV): 0.71627688
Kurtosis: 11.114646
Mean: 13611.359
Median Absolute Deviation (MAD): 5650
Skewness: 1.9534444
Sum: 7.6063676 × 10^9
Variance: 95052782
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
11000 4453
 
0.8%
12000 4450
 
0.8%
13000 4334
 
0.8%
10000 4029
 
0.7%
14000 3899
 
0.7%
11500 3876
 
0.7%
12500 3714
 
0.7%
9000 3689
 
0.7%
10500 3540
 
0.6%
15000 3386
 
0.6%
Other values (1877) 519455
93.0%
ValueCountFrequency (%)
1 4
 
< 0.1%
100 19
 
< 0.1%
125 1
 
< 0.1%
150 21
 
< 0.1%
175 10
 
< 0.1%
200 196
 
< 0.1%
225 105
 
< 0.1%
250 281
 
0.1%
275 124
 
< 0.1%
300 1282
0.2%
ValueCountFrequency (%)
230000 1
< 0.1%
183000 1
< 0.1%
173000 1
< 0.1%
171500 1
< 0.1%
169500 1
< 0.1%
169000 1
< 0.1%
167000 1
< 0.1%
165000 2
< 0.1%
163000 2
< 0.1%
161000 1
< 0.1%

saledate
Text

Distinct: 3766
Distinct (%): 0.7%
Missing: 12
Missing (%): < 0.1%
Memory size: 51.2 MiB

Length

Max length: 39
Median length: 39
Mean length: 38.998409
Min length: 4

Characters and Unicode

Total characters: 21793286
Distinct characters: 40
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 604
Unique (%): 0.1%

Sample

1st row: Tue Dec 16 2014 12:30:00 GMT-0800 (PST)
2nd row: Tue Dec 16 2014 12:30:00 GMT-0800 (PST)
3rd row: Thu Jan 15 2015 04:30:00 GMT-0800 (PST)
4th row: Thu Jan 29 2015 04:30:00 GMT-0800 (PST)
5th row: Thu Dec 18 2014 12:30:00 GMT-0800 (PST)
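
Because this column is stored as text, the sample strings above would need parsing before any date arithmetic. The following is a minimal sketch using only the standard library; the helper name `parse_saledate` is hypothetical. It assumes every well-formed value follows the pattern of the samples, and it returns `None` for anything else, since the reported minimum length of 4 suggests some values do not follow it. Note that `%a` and `%b` rely on an English (or C) locale.

```python
from datetime import datetime
from typing import Optional

# Pattern for the part before the zone abbreviation, e.g.
# "Tue Dec 16 2014 12:30:00 GMT-0800"
SALEDATE_FORMAT = "%a %b %d %Y %H:%M:%S GMT%z"


def parse_saledate(raw: str) -> Optional[datetime]:
    """Parse a saledate string into a timezone-aware datetime, or None."""
    # Drop the trailing abbreviation such as " (PST)"; the numeric
    # offset already carries the same information.
    head = raw.split(" (", 1)[0].strip()
    try:
        return datetime.strptime(head, SALEDATE_FORMAT)
    except ValueError:
        return None


parsed = parse_saledate("Tue Dec 16 2014 12:30:00 GMT-0800 (PST)")
print(parsed.isoformat() if parsed else None)
```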
ValueCountFrequency (%)
2015 505072
 
12.9%
gmt-0800 395489
 
10.1%
pst 395489
 
10.1%
wed 166069
 
4.2%
tue 163950
 
4.2%
pdt 163310
 
4.2%
gmt-0700 163310
 
4.2%
feb 163053
 
4.2%
thu 153750
 
3.9%
jan 140815
 
3.6%
Other values (334) 1501312
38.4%

Most occurring characters

ValueCountFrequency (%)
0 4819691
22.1%
(space) 3352794
15.4%
T 1435298
 
6.6%
: 1117598
 
5.1%
1 1061408
 
4.9%
2 962306
 
4.4%
M 673292
 
3.1%
5 660463
 
3.0%
G 558799
 
2.6%
) 558799
 
2.6%
Other values (30) 6592838
30.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8940909
41.0%
Uppercase Letter 4470392
20.5%
Space Separator 3352794
 
15.4%
Lowercase Letter 2235196
 
10.3%
Other Punctuation 1117598
 
5.1%
Close Punctuation 558799
 
2.6%
Open Punctuation 558799
 
2.6%
Dash Punctuation 558799
 
2.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 546592
24.5%
u 419079
18.7%
n 256663
11.5%
a 239546
10.7%
d 166069
 
7.4%
b 163053
 
7.3%
h 153750
 
6.9%
r 106839
 
4.8%
i 59112
 
2.6%
c 53520
 
2.4%
Other values (5) 70973
 
3.2%
Decimal Number
ValueCountFrequency (%)
0 4819691
53.9%
1 1061408
 
11.9%
2 962306
 
10.8%
5 660463
 
7.4%
8 471020
 
5.3%
3 395580
 
4.4%
7 235904
 
2.6%
4 189724
 
2.1%
9 78907
 
0.9%
6 65906
 
0.7%
Uppercase Letter
ValueCountFrequency (%)
T 1435298
32.1%
M 673292
15.1%
G 558799
 
12.5%
P 558799
 
12.5%
S 395638
 
8.9%
J 242052
 
5.4%
F 222165
 
5.0%
D 216830
 
4.9%
W 166069
 
3.7%
A 1450
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 3352794
100.0%
Other Punctuation
ValueCountFrequency (%)
: 1117598
100.0%
Close Punctuation
ValueCountFrequency (%)
) 558799
100.0%
Open Punctuation
ValueCountFrequency (%)
( 558799
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 558799
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 15087698
69.2%
Latin 6705588
30.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 1435298
21.4%
M 673292
10.0%
G 558799
 
8.3%
P 558799
 
8.3%
e 546592
 
8.2%
u 419079
 
6.2%
S 395638
 
5.9%
n 256663
 
3.8%
J 242052
 
3.6%
a 239546
 
3.6%
Other values (15) 1379830
20.6%
Common
ValueCountFrequency (%)
0 4819691
31.9%
(space) 3352794
22.2%
: 1117598
 
7.4%
1 1061408
 
7.0%
2 962306
 
6.4%
5 660463
 
4.4%
) 558799
 
3.7%
( 558799
 
3.7%
- 558799
 
3.7%
8 471020
 
3.1%
Other values (5) 966021
 
6.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21793286
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 4819691
22.1%
(space) 3352794
15.4%
T 1435298
 
6.6%
: 1117598
 
5.1%
1 1061408
 
4.9%
2 962306
 
4.4%
M 673292
 
3.1%
5 660463
 
3.0%
G 558799
 
2.6%
) 558799
 
2.6%
Other values (30) 6592838
30.3%

Interactions

[Figure: 25 pairwise interaction plots]

Correlations

variable       color  condition  interior     mmr  odometer  sellingprice  transmission    year
color          1.000      0.056     0.102   0.056     0.066         0.048         0.818   0.091
condition      0.056      1.000     0.057   0.427    -0.404         0.480         0.031   0.387
interior       0.102      0.057     1.000   0.063     0.091         0.062         0.064   0.109
mmr            0.056      0.427     0.063   1.000    -0.718         0.979         0.027   0.697
odometer       0.066     -0.404     0.091  -0.718     1.000        -0.705         0.016  -0.817
sellingprice   0.048      0.480     0.062   0.979    -0.705         1.000         0.007   0.679
transmission   0.818      0.031     0.064   0.027     0.016         0.007         1.000   0.052
year           0.091      0.387     0.109   0.697    -0.817         0.679         0.052   1.000
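
To read the matrix against the High correlation alerts at the top of the report, the sketch below lists every variable pair whose absolute coefficient meets a cutoff. The cutoff of 0.5 is an assumption chosen for illustration; it happens to reproduce the alerted pairs, but the report itself does not state the threshold it used. The function name `high_pairs` is hypothetical.

```python
names = ["color", "condition", "interior", "mmr",
         "odometer", "sellingprice", "transmission", "year"]

# Coefficients copied from the correlation matrix in this report.
matrix = [
    [1.000, 0.056, 0.102, 0.056, 0.066, 0.048, 0.818, 0.091],
    [0.056, 1.000, 0.057, 0.427, -0.404, 0.480, 0.031, 0.387],
    [0.102, 0.057, 1.000, 0.063, 0.091, 0.062, 0.064, 0.109],
    [0.056, 0.427, 0.063, 1.000, -0.718, 0.979, 0.027, 0.697],
    [0.066, -0.404, 0.091, -0.718, 1.000, -0.705, 0.016, -0.817],
    [0.048, 0.480, 0.062, 0.979, -0.705, 1.000, 0.007, 0.679],
    [0.818, 0.031, 0.064, 0.027, 0.016, 0.007, 1.000, 0.052],
    [0.091, 0.387, 0.109, 0.697, -0.817, 0.679, 0.052, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not taken from the report


def high_pairs(names, matrix, threshold):
    """Return (a, b, coefficient) for each pair with |coefficient| >= threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):  # upper triangle only
            if abs(matrix[i][j]) >= threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    return pairs


pairs = high_pairs(names, matrix, THRESHOLD)
for a, b, r in pairs:
    print(f"{a} ~ {b}: {r:+.3f}")
```

Under this cutoff, condition and interior appear in no pair, which is consistent with neither variable carrying a High correlation alert.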

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
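
One way to make the idea of nullity correlation concrete is to turn each column into a 0/1 "is missing" indicator and compute the Pearson correlation between two such indicators. The sketch below does this in plain Python on a handful of hypothetical rows (not taken from the dataset); the function name `nullity_correlation` is also hypothetical, and the report's own heatmap may be computed differently.

```python
from math import sqrt


def nullity_correlation(rows, col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if row.get(col_a) is None else 0.0 for row in rows]
    b = [1.0 if row.get(col_b) is None else 0.0 for row in rows]
    n = len(rows)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no defined correlation.
        return float("nan")
    return cov / sqrt(var_a * var_b)


# Hypothetical rows: make and model are always missing together,
# while transmission is missing independently of make.
rows = [
    {"make": "Kia", "model": "Sorento", "transmission": "automatic"},
    {"make": None, "model": None, "transmission": "automatic"},
    {"make": "BMW", "model": "X5", "transmission": None},
    {"make": None, "model": None, "transmission": None},
]

make_model = nullity_correlation(rows, "make", "model")
make_transmission = nullity_correlation(rows, "make", "transmission")
print(make_model, make_transmission)
```

A value near 1 means the two columns tend to be missing in the same rows; a value near 0 means missingness in one says little about the other.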

Sample

(index) | year | make | model | trim | body | transmission | vin | state | condition | odometer | color | interior | seller | mmr | sellingprice | saledate
0 | 2015 | Kia | Sorento | LX | SUV | automatic | 5xyktca69fg566472 | ca | 5.0 | 16639.0 | white | black | kia motors america inc | 20500.0 | 21500.0 | Tue Dec 16 2014 12:30:00 GMT-0800 (PST)
1 | 2015 | Kia | Sorento | LX | SUV | automatic | 5xyktca69fg561319 | ca | 5.0 | 9393.0 | white | beige | kia motors america inc | 20800.0 | 21500.0 | Tue Dec 16 2014 12:30:00 GMT-0800 (PST)
2 | 2014 | BMW | 3 Series | 328i SULEV | Sedan | automatic | wba3c1c51ek116351 | ca | 45.0 | 1331.0 | gray | black | financial services remarketing (lease) | 31900.0 | 30000.0 | Thu Jan 15 2015 04:30:00 GMT-0800 (PST)
3 | 2015 | Volvo | S60 | T5 | Sedan | automatic | yv1612tb4f1310987 | ca | 41.0 | 14282.0 | white | black | volvo na rep/world omni | 27500.0 | 27750.0 | Thu Jan 29 2015 04:30:00 GMT-0800 (PST)
4 | 2014 | BMW | 6 Series Gran Coupe | 650i | Sedan | automatic | wba6b2c57ed129731 | ca | 43.0 | 2641.0 | gray | black | financial services remarketing (lease) | 66000.0 | 67000.0 | Thu Dec 18 2014 12:30:00 GMT-0800 (PST)
5 | 2015 | Nissan | Altima | 2.5 S | Sedan | automatic | 1n4al3ap1fn326013 | ca | 1.0 | 5554.0 | gray | black | enterprise vehicle exchange / tra / rental / tulsa | 15350.0 | 10900.0 | Tue Dec 30 2014 12:00:00 GMT-0800 (PST)
6 | 2014 | BMW | M5 | Base | Sedan | automatic | wbsfv9c51ed593089 | ca | 34.0 | 14943.0 | black | black | the hertz corporation | 69000.0 | 65000.0 | Wed Dec 17 2014 12:30:00 GMT-0800 (PST)
7 | 2014 | Chevrolet | Cruze | 1LT | Sedan | automatic | 1g1pc5sb2e7128460 | ca | 2.0 | 28617.0 | black | black | enterprise vehicle exchange / tra / rental / tulsa | 11900.0 | 9800.0 | Tue Dec 16 2014 13:00:00 GMT-0800 (PST)
8 | 2014 | Audi | A4 | 2.0T Premium Plus quattro | Sedan | automatic | wauffafl3en030343 | ca | 42.0 | 9557.0 | white | black | audi mission viejo | 32100.0 | 32250.0 | Thu Dec 18 2014 12:00:00 GMT-0800 (PST)
9 | 2014 | Chevrolet | Camaro | LT | Convertible | automatic | 2g1fb3d37e9218789 | ca | 3.0 | 4809.0 | red | black | d/m auto sales inc | 26300.0 | 17500.0 | Tue Jan 20 2015 04:00:00 GMT-0800 (PST)

(index) | year | make | model | trim | body | transmission | vin | state | condition | odometer | color | interior | seller | mmr | sellingprice | saledate
558827 | 2014 | Jeep | Grand Cherokee | Laredo | SUV | automatic | 1c4rjfag0ec466276 | pa | 42.0 | 25180.0 | gray | black | hertz corporation/gdp | 26000.0 | 24500.0 | Tue Jul 07 2015 06:30:00 GMT-0700 (PDT)
558828 | 2012 | Dodge | Grand Caravan | American Value Package | Minivan | automatic | 2c4rdgbg1cr349287 | ma | 37.0 | 97036.0 | silver | gray | ge fleet services for itself/servicer | 8300.0 | 7800.0 | Tue Jul 07 2015 06:30:00 GMT-0700 (PDT)
558829 | 2012 | Hyundai | Elantra | Limited | Sedan | NaN | 5npdh4ae7ch106397 | pa | 4.0 | 66720.0 | gray | gray | champion mazda | 10250.0 | 10400.0 | Wed Jul 08 2015 07:30:00 GMT-0700 (PDT)
558830 | 2012 | Nissan | Sentra | 2.0 SR | Sedan | NaN | 3n1ab6ap3cl622485 | tn | 26.0 | 35858.0 | white | gray | nissan-infiniti lt | 9950.0 | 10400.0 | Wed Jul 08 2015 17:15:00 GMT-0700 (PDT)
558831 | 2011 | BMW | 5 Series | 528i | Sedan | automatic | wbafr1c53bc744672 | fl | 39.0 | 66403.0 | white | brown | lauderdale imports ltd bmw pembrok pines | 20300.0 | 22800.0 | Tue Jul 07 2015 06:15:00 GMT-0700 (PDT)
558832 | 2015 | Kia | K900 | Luxury | Sedan | NaN | knalw4d4xf6019304 | in | 45.0 | 18255.0 | silver | black | avis corporation | 35300.0 | 33000.0 | Thu Jul 09 2015 07:00:00 GMT-0700 (PDT)
558833 | 2012 | Ram | 2500 | Power Wagon | Crew Cab | automatic | 3c6td5et6cg112407 | wa | 5.0 | 54393.0 | white | black | i -5 uhlmann rv | 30200.0 | 30800.0 | Wed Jul 08 2015 09:30:00 GMT-0700 (PDT)
558834 | 2012 | BMW | X5 | xDrive35d | SUV | automatic | 5uxzw0c58cl668465 | ca | 48.0 | 50561.0 | black | black | financial services remarketing (lease) | 29800.0 | 34000.0 | Wed Jul 08 2015 09:30:00 GMT-0700 (PDT)
558835 | 2015 | Nissan | Altima | 2.5 S | sedan | automatic | 1n4al3ap0fc216050 | ga | 38.0 | 16658.0 | white | black | enterprise vehicle exchange / tra / rental / tulsa | 15100.0 | 11100.0 | Thu Jul 09 2015 06:45:00 GMT-0700 (PDT)
558836 | 2014 | Ford | F-150 | XLT | SuperCrew | automatic | 1ftfw1et2eke87277 | ca | 34.0 | 15008.0 | gray | gray | ford motor credit company llc pd | 29600.0 | 26700.0 | Thu May 28 2015 05:30:00 GMT-0700 (PDT)